Sframe header #151219
Closed
) This prints the prefix on every new line, allowing for an output that looks like:
```
[dead-code-analysis] DeadCodeAnalysis.cpp:288 Visiting operation: func.func private @private_1() -> (i32, i32) {
[dead-code-analysis] DeadCodeAnalysis.cpp:288   %c0_i32 = arith.constant 0 : i32
[dead-code-analysis] DeadCodeAnalysis.cpp:288   %0 = arith.addi %c0_i32, %c0_i32 {tag = "one"} : i32
[dead-code-analysis] DeadCodeAnalysis.cpp:288   return %c0_i32, %0 : i32, i32
[dead-code-analysis] DeadCodeAnalysis.cpp:288 }
[dead-code-analysis] DeadCodeAnalysis.cpp:313 Visiting callable operation: func.func private @private_1() -> (i32, i32) {
[dead-code-analysis] DeadCodeAnalysis.cpp:313   %c0_i32 = arith.constant 0 : i32
[dead-code-analysis] DeadCodeAnalysis.cpp:313   %0 = arith.addi %c0_i32, %c0_i32 {tag = "one"} : i32
[dead-code-analysis] DeadCodeAnalysis.cpp:313   return %c0_i32, %0 : i32, i32
[dead-code-analysis] DeadCodeAnalysis.cpp:313 }
```
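The per-line prefixing described in this commit message can be sketched in a few lines of standard C++. This is only an illustration of the idea, not the LLVM/MLIR debug-logging implementation; the helper name `prefixEveryLine` and its signature are assumptions made up for this example.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not an LLVM API): returns `text` with `prefix`
// prepended to every line, each line terminated by '\n'.
std::string prefixEveryLine(const std::string &prefix, const std::string &text) {
  std::istringstream in(text);
  std::string line;
  std::string out;
  // std::getline splits on '\n', so a multi-line operation dump is
  // re-emitted one line at a time, each carrying the same prefix.
  while (std::getline(in, line)) {
    out += prefix;
    out += line;
    out += '\n';
  }
  return out;
}
```

Calling `prefixEveryLine("[dead-code-analysis] ", dump)` on a multi-line string would tag each line of `dump`, which is the shape of the output shown in the commit message.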
…clone" (llvm#150856) Reverts llvm#150735 due to bot failures that I need to investigate
…vm#149836) Microbenchmarking shows this is faster
When the Mips backend processed emitStartOfAsmFile and updateABIInfo, it did not yet know the real values of IsSoftFloat and STI.useSoftFloat(). When the inline asm instruction was empty, Mips never ran the asm parser, so it did not call TS.updateABIInfo(STI) again at the point where the value of IsSoftFloat is correct. Fixes llvm#135283.
This is a follow up of 9e79991. It attempts to fix the CI flakes we saw on fuchsia-linux-x64 builder.
Tracked at llvm#112294. This patch partially implements [basic.link]p14 through [basic.link]p18. The explicitly missing parts are:
- Anything related to specializations.
- Deciding at compile time whether a pointer is associated with a TU-local value.
- [basic.link]p15.1.2, which decides whether a type is TU-local.
- Diagnosing TU-local functions from other TUs that are collected into the overload set. See [basic.link]p19, the call to 'h(N::A{});' in translation unit #2.

There are probably other implicitly missing parts, since the wording uses "names" briefly several times. To implement this precisely, we would have to visit the whole AST, including Decls, Expressions and Types, which may be harder to implement and more costly in compilation time. So I chose to implement the common parts. Missing some cases is not too bad, since we did not do any such checks in the past 3 years. Any new check is an improvement.

Given that modules have been basically available since clang15 without such checks, it would be user-unfriendly to give a hard error now. There are also many cases that violate the rule but are actually fine. So I decided to emit warnings instead of hard errors.
* Replace undef -> poison * Remove overloaded type in intrinsic signature
Vector costs without zvfhmin/zvfbfmin and zfhmin/zfbfmin are somehow cheaper than with zvfhmin/zvfbfmin at smaller vector sizes, despite the fact that the former are scalarized to libcalls. This adds a RUN line to showcase this, splitting out the bfloat tests into their own functions so we don't have duplicate lines for the regular float/double costs.
…scv-target-features-sifive.c. NFC.
…v-target-features-cv.c. NFC.
…iscv-target-features-thead.c. NFC.
…zers` check (llvm#150842) Resolves llvm#150782.
Same as llvm#133050. Co-authored-by: Oke, Akshat <[email protected]>
…vm#150883) Reverts llvm#148114 will update with fixed PR.
If we have instructions in the second loop's preheader which can be sunk, we should also adjust PHI nodes to receive values from the fused loop's latch block. Fixes llvm#128600
Those are metadata sections for ELF but were not properly set for COFF.
I'm just trying to more consistently use CHECK-SD and CHECK-GI.
This relands commit llvm@7355ea3. The original commit was reverted in llvm@bfd73a5 because it was breaking the buildbot. The issue has now been resolved by llvm@38f8253.

Original PR: llvm#139348

Original commit message:
<blockquote>
closes clangd/clangd#1037
closes clangd/clangd#2240

Example:
```c++
class Base {
public:
  virtual void publicMethod() = 0;

protected:
  virtual auto privateMethod() const -> int = 0;
};

// Before:
//                            // cursor here
class Derived : public Base{}^ ;

// After:
class Derived : public Base {
public:
  void publicMethod() override {
    // TODO: Implement this pure virtual method.
    static_assert(false, "Method `publicMethod` is not implemented.");
  }

protected:
  auto privateMethod() const -> int override {
    // TODO: Implement this pure virtual method.
    static_assert(false, "Method `privateMethod` is not implemented.");
  }
};
```
https://github.com/user-attachments/assets/79de40d9-1004-4c2e-8f5c-be1fb074c6de
</blockquote>
This effectively reverts a4d4859 which didn't fix the problem that `int*,` was not counted as "Left" alignment. Fixes llvm#150327
These float operations were expanded for scalar f32/f64/f128, but not for f16 and more problematically, not for vectors. A small subset of them was separately set to expand for vectors. Change these to always expand by default, and adjust targets to mark these as legal where necessary instead. This is a much safer default, and avoids unnecessary legalization failures because a target failed to manually mark them as expand. Fixes llvm#110753. Fixes llvm#121390.
This is a nomination for the maintainers of the core category within MLIR as proposed in https://discourse.llvm.org/t/mlir-project-maintainers/87189. As agreed in the Project Council meeting on July 17, we are proceeding with category nominations without waiting for lead maintainers to be nominated.
) In llvm#149156, I ensured that we no longer generate spurious `tensor.empty` ops when vectorizing `linalg.unpack`. This follow-up removes leftover code that is now redundant but was missed in the original PR.
This is a nomination for the maintainers of the tensor compiler category within MLIR as proposed in https://discourse.llvm.org/t/mlir-project-maintainers/87189. As agreed in the Project Council meeting on July 17, we are proceeding with category nominations without waiting for lead maintainers to be nominated.
This isn't needed after we set the tail folding style to data-with-evl via TTI in llvm#148686. Also rename the tests to reflect the fact they're no longer forcing the tail folding style.
Those sections are generated by -fembed-bitcode and do not need to be kept in executable files.
The SetThreadInformation API allows threads to be scheduled on the most efficient cores on the most efficient frequency. Using this API for ThreadPriority::Background should make clangd-based IDEs a little less CPU hungry. --------- Co-authored-by: Alexandre Ganea <[email protected]>
This PR adds support for `-ffine-grained-bitfield-accesses`. I reused the tests from classic CodeGen, available here: [https://github.com/llvm/llvm-project/blob/c2c881fcc85e0c2d7a050b0199d4dadf8f556b9e/clang/test/CodeGenCXX/finegrain-bitfield-access.cpp](https://github.com/llvm/llvm-project/blob/c2c881fcc85e0c2d7a050b0199d4dadf8f556b9e/clang/test/CodeGenCXX/finegrain-bitfield-access.cpp) We produce almost exactly the same codegen, except when returning a variable: we emit an extra variable to hold the return value, whereas classic CodeGen does not. Also, the GEP instructions use slightly different syntax compared to classic CodeGen.
## Purpose
This patch is one in a series of code-mods that annotate LLVM’s public interface for export. This patch annotates symbols that were recently added to LLVM without proper annotations. The annotations currently have no meaningful impact on the LLVM build; however, they are a prerequisite to support an LLVM Windows DLL (shared library) build.

## Background
This effort is tracked in llvm#109483. Additional context is provided in [this discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307), and documentation for `LLVM_ABI` and related annotations is found in the LLVM repo [here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).

## Overview
The bulk of these changes were generated automatically using the [Interface Definition Scanner (IDS)](https://github.com/compnerd/ids) tool, followed by formatting with `git clang-format`.

The following manual adjustments were also applied after running IDS:
- Add `LLVM_EXPORT_TEMPLATE` and `LLVM_TEMPLATE_ABI` annotations to explicitly instantiated instances of `llvm::object::SFrameParser`.

## Validation
Local builds and tests to validate cross-platform compatibility. This included llvm, clang, and lldb on the following configurations:
- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
This patch adds a new variety of driver to Dexter, allowing it to work with DAP-based interfaces for debuggers. The first concrete instance of this is implemented in this patch, adding support for an `lldb-dap` debugger. This is functionally very similar to the existing LLDB debugger support*, but uses lldb-dap as its executable instead of lldb. This has been tested successfully against the existing feature_test suite, and manually tested against some other inputs; support is essentially complete, although any further DAP-based debuggers may require additional hooks inserted into the base class to deal with any idiosyncrasies they exhibit (as with the several that have been inserted for lldb-dap).

*NB: There are some small differences resulting from differences between lldb-dap's use of the lldb API and Dexter's use in its lldb driver. One small example is variable evaluation: lldb-dap will first try `GetValueForVariablePath` and fall back to `EvaluateExpression` if necessary, while Dexter will always use `EvaluateExpression`. These can give slightly different results, resulting in different output from Dexter for the same input.
llvm#150443) Update docs to state that reduction is supported on OpenMP `loop` and `teams` standalone and compound constructs.
This PR introduces the initial version of a C++ framework for the conformance testing of GPU math library functions, building upon the skeleton provided in llvm#146391. The main goal of this framework is to systematically measure the accuracy of math functions in the GPU libc, verifying correctness or at least conformance to standards like OpenCL via exhaustive or random accuracy tests.
This adds standard-conforming handling for calls to functions that were declared in C source in the no-prototype form.
This PR implements fabsbf16 math function for BFloat16 type along with the tests. --------- Signed-off-by: krishna2803 <[email protected]> Signed-off-by: Krishna Pandey <[email protected]> Co-authored-by: OverMighty <[email protected]>
This adds support for array cleanups, including the ArrayDtor op.
My aim here is to make these a little easier to maintain by relying on aliases where these instructions overlap with the Hint instructions they are based on.

The following instructions have not been converted to aliases as they have complex mappings from their immediate encodings to the immediate encoding of the underlying instruction (setting high bits):
- qc.pputci
- qc.sync, qc.sync, qc.syncwf, qc.syncwl
- qc.c.sync, qc.c.syncr, qc.c.syncwf, qc.syncwl

Co-authored-by: Sudharsan Veeravalli <[email protected]>
…1182) I was observing segfaults at executable exit in the rtsan instrumented unit tests. Bisecting the offending test led to observing that this test is not using our safe test fixture for anything involving a file descriptor. Changing to use the fixture eliminated the segfault on exit.
…48713) `LocationDescription` contains both the insertion point and the debug location. When `LocationDescription` is available, it is better to use `updateToLocation` which will update both. This PR replaces `restoreIP(Loc.IP)` with `updateToLocation(Loc)` as former may not update debug location in all cases. I am not checking the return value of `updateToLocation` because that is checked just a few lines above in all cases and we would have returned early if it failed.
Added crash on nullptr to mbstowcs --------- Co-authored-by: Sriya Pratipati <[email protected]>
…sts (llvm#150867) There is a pattern that rewrites elementwise_op(broadcast(x1 : T to U), broadcast(x2 : T to U), ...) to broadcast(elementwise_op(x1, x2, ...) : T to U). This pattern did not, however, account for the case where a broadcast constant is represented as a SplatElementsAttr, which can safely be reshaped or scalarized but is not a `vector.broadcast` or `vector.splat` operation. This patch fixes this oversight, preventing premature broadcasting. This did result in the need to update some linalg dialect tests, which now feature a less-broadcast computation and/or more constant folding.
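The rewrite in this commit message relies on the fact that broadcasting and then applying an elementwise operation gives the same result as applying the operation to the scalars and broadcasting once. A minimal plain-C++ analogy is sketched below; it does not use MLIR, and the names `broadcast` and `addElementwise` are hypothetical stand-ins for `vector.broadcast` and an elementwise op.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a broadcast: replicate scalar `x` into `n` lanes.
std::vector<int> broadcast(int x, std::size_t n) {
  return std::vector<int>(n, x);
}

// Hypothetical stand-in for an elementwise op: lane-wise addition.
// Assumes both inputs have the same number of lanes.
std::vector<int> addElementwise(const std::vector<int> &lhs,
                                const std::vector<int> &rhs) {
  std::vector<int> out(lhs.size());
  for (std::size_t i = 0; i < lhs.size(); ++i)
    out[i] = lhs[i] + rhs[i];
  return out;
}
```

With these helpers, `addElementwise(broadcast(a, n), broadcast(b, n))` and `broadcast(a + b, n)` hold the same values, but the second form performs one scalar addition instead of `n` lane additions.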
- BUF/FLAT/GLOBAL_ADD/MIN/MAX_F64 - DS_ADD_F64 Co-authored-by: Konstantin Zhuravlyov <Konstantin [email protected]>
This change implements correct lowering of function aliases to the LLVM dialect.
…r convs (llvm#149576) This PR fixes the computation of padded shapes for convolution-style affine maps (e.g., d0 + d1) in `PadTilingInterface`. Previously, the code used the direct sum of loop upper bounds, leading to over-padding. For example, for the following `conv_2d_nhwc_fhwc` op, when only the c dimension is padded to a multiple of 16, the old code also incorrectly pads the convolved dimensions and generates the wrong input shape:
```
%padded = tensor.pad %arg0 low[0, 0, 0, 0] high[0, 1, 1, 12] {
^bb0(%arg3: index, %arg4: index, %arg5: index, %arg6: index):
  tensor.yield %cst : f32
} : tensor<1x16x16x4xf32> to tensor<1x17x17x16xf32>
%padded_0 = tensor.pad %arg1 low[0, 0, 0, 0] high[0, 0, 0, 12] {
^bb0(%arg3: index, %arg4: index, %arg5: index, %arg6: index):
  tensor.yield %cst : f32
} : tensor<16x3x3x4xf32> to tensor<16x3x3x16xf32>
%0 = linalg.conv_2d_nhwc_fhwc {dilations = dense<1> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>} ins(%padded, %padded_0 : tensor<1x17x17x16xf32>, tensor<16x3x3x16xf32>) outs(%arg2 : tensor<1x14x14x16xf32>) -> tensor<1x14x14x16xf32>
return %0 : tensor<1x14x14x16xf32>
```
The new implementation uses the maximum accessed index as the input to the affine map and then adds 1 after aggregating all the terms to get the final padded size. Fixes llvm#148679.
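The difference between the two size computations can be checked with a small arithmetic sketch. This is not the `PadTilingInterface` code; the `Term` struct and both function names are hypothetical, and the sketch assumes positive coefficients and ignores rounding up to a padding multiple.

```cpp
#include <cstdint>
#include <vector>

// One term `coeff * d_i` of an affine expression such as d0 + d1.
// Hypothetical type for illustration only.
struct Term {
  int64_t coeff;      // multiplier of the loop index (assumed positive)
  int64_t upperBound; // exclusive loop bound: d_i ranges over [0, upperBound)
};

// Old behaviour as described: evaluate the expression at the upper bounds.
int64_t sizeFromUpperBounds(const std::vector<Term> &terms) {
  int64_t size = 0;
  for (const Term &t : terms)
    size += t.coeff * t.upperBound;
  return size;
}

// New behaviour as described: evaluate the expression at the maximum
// accessed index of every loop (upperBound - 1), then add 1 once.
int64_t sizeFromMaxIndex(const std::vector<Term> &terms) {
  int64_t maxIndex = 0;
  for (const Term &t : terms)
    maxIndex += t.coeff * (t.upperBound - 1);
  return maxIndex + 1;
}
```

For the spatial dimensions in the example above (output extent 14, kernel extent 3, unit stride and dilation), the first function gives 14 + 3 = 17, matching the over-padded `17x17` shape, while the second gives 13 + 2 + 1 = 16, the actual input extent.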
…m#149637) The Cygwin target is generally very similar to the MinGW target. The similarities include the default auto-import behavior, the default calling convention, the `.dll.a` import library extension, the `__GXX_TYPEINFO_EQUALITY_INLINE` pre-define by `g++`, and the long double configuration. Co-authored-by: Mateusz Mikuła <[email protected]>
…m#151042) D16 pseudo instructions are introduced in true16 mode to represent a D16 load/store. In MC lowering, the pseudo instructions are lowered to the corresponding D16 Lo/Hi MC Inst respecting the register allocation. However, the pseudo instruction has size 0, which causes an issue in the Inst size estimation. Use D16 Lo when calculating the inst size.
llvm#149614) When OpenACC is enabled and Fortran loops are annotated with `acc loop`, they are lowered to the `acc.loop` operation, and the rest of the contained loops use the normal FIR lowering path. However, the OpenACC specification has special provisions related to contained loops and their induction variables. To adhere to these, we convert all valid contained loops to `acc.loop` so that this information is stored appropriately.

The provisions in the spec that motivated this change (line numbers are from OpenACC 3.4):
- 1353: Loop variables in Fortran do statements within a compute construct are predetermined to be private to the thread that executes the loop.
- 3783: When do concurrent appears without a loop construct in a kernels construct it is treated as if it is annotated with loop auto. If it appears in a parallel construct or an accelerator routine then it is treated as if it is annotated with loop independent.

By valid loops we mean do loops and do concurrent loops which have an induction variable. Loops which are unstructured are not handled.
Extend support in LLDB for WebAssembly. This PR adds a new Process plugin (ProcessWasm) that extends ProcessGDBRemote for WebAssembly targets. It adds support for WebAssembly's memory model with separate address spaces, and the ability to fetch the call stack from the WebAssembly runtime.

I have tested this change with the WebAssembly Micro Runtime (WAMR, https://github.com/bytecodealliance/wasm-micro-runtime) which implements a GDB debug stub and supports the qWasmCallStack packet.
```
(lldb) process connect --plugin wasm connect://localhost:4567
Process 1 stopped
* thread #1, name = 'nobody', stop reason = trace
    frame #0: 0x40000000000001ad
wasm32_args.wasm`main:
->  0x40000000000001ad <+3>:  global.get 0
    0x40000000000001b3 <+9>:  i32.const 16
    0x40000000000001b5 <+11>: i32.sub
    0x40000000000001b6 <+12>: local.set 0
(lldb) b add
Breakpoint 1: where = wasm32_args.wasm`add + 28 at test.c:4:12, address = 0x400000000000019c
(lldb) c
Process 1 resuming
Process 1 stopped
* thread #1, name = 'nobody', stop reason = breakpoint 1.1
    frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12
   1    int
   2    add(int a, int b)
   3    {
-> 4      return a + b;
   5    }
   6
   7    int
(lldb) bt
* thread #1, name = 'nobody', stop reason = breakpoint 1.1
  * frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12
    frame #1: 0x40000000000001e5 wasm32_args.wasm`main at test.c:12:12
    frame #2: 0x40000000000001fe wasm32_args.wasm
```
This PR is based on an unmerged patch from Paolo Severini: https://reviews.llvm.org/D78801. I intentionally stuck to the foundations to keep this PR small. I have more PRs in the pipeline to support the other features/packets.

My motivation for supporting Wasm is to support debugging Swift compiled to WebAssembly: https://www.swift.org/documentation/articles/wasm-getting-started.html
…library (llvm#151183) And prune deps when splitting
…, X, and(B,C)) (llvm#141733)
## Description
Adds support for ternary equivalent operations of the form `ternary(A, X, and(B,C))` where `X=[xor(B,C)| nor(B,C)| eqv(B,C)| not(B)| not(C)]`.

List of `xxeval` equivalent ternary operations added and the corresponding `imm` value required:

Ternary Operator| Imm Value
--|--
ternary(A, xor(B,C), and(B,C)) | 22
ternary(A, nor(B,C), and(B,C)) | 24
ternary(A, eqv(B,C), and(B,C)) | 25
ternary(A, not(C), and(B,C)) | 26
ternary(A, not(B), and(B,C)) | 28

e.g. `xxeval XT,XA,XB,XC,22` performs `XA ? xor(XB, XC) : and(XB,XC)` and places the result in `XT`.

Co-authored-by: Tony Varghese <[email protected]>
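The `imm` values in the table can be reproduced by treating the immediate as an 8-entry truth table over (A, B, C). The sketch below is an illustration, not code from the patch: `truthTableImm` and `immXorAnd` are hypothetical names, and the bit ordering (most significant bit for A=B=C=0, least significant bit for A=B=C=1) is an assumption chosen because it matches the five listed values.

```cpp
#include <cstdint>
#include <functional>

// Build an 8-bit truth-table immediate for a three-input boolean function.
// Assumed bit order: the entry for (A, B, C) goes to bit 7 - (4*A + 2*B + C).
uint8_t truthTableImm(const std::function<bool(bool, bool, bool)> &f) {
  uint8_t imm = 0;
  for (int idx = 0; idx < 8; ++idx) {
    bool a = (idx & 4) != 0;
    bool b = (idx & 2) != 0;
    bool c = (idx & 1) != 0;
    if (f(a, b, c))
      imm |= static_cast<uint8_t>(1u << (7 - idx));
  }
  return imm;
}

// ternary(A, xor(B,C), and(B,C)): pick xor when A is set, and otherwise.
uint8_t immXorAnd() {
  return truthTableImm(
      [](bool a, bool b, bool c) { return a ? (b != c) : (b && c); });
}
```

Under the stated bit-order assumption, `immXorAnd()` evaluates to 22, and swapping the selected function for nor, eqv, not(C) or not(B) gives 24, 25, 26 and 28 respectively.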
No description provided.